26  ANOVA

ANOVA, which stands for Analysis of Variance, is a statistical technique used to determine if there are any statistically significant differences between the means of three or more independent (unrelated) groups. It tests the hypothesis that the means of several groups are equal, and it does this by comparing the variance (spread) of scores among the groups to the variance within each group. The primary goal of ANOVA is to uncover whether there is a difference among group means, rather than determining which specific groups are different from each other.

26.1 Types of ANOVA

  1. One-Way ANOVA: Also known as single-factor ANOVA, it assesses the impact of a single factor (independent variable) on a continuous outcome variable. It compares the means across two or more groups; with exactly two groups it is equivalent to an independent-samples t-test. For example, testing the effect of different diets on weight loss.

  2. Two-Way ANOVA: This extends the one-way design by examining the impact of two factors simultaneously on a continuous outcome. It can also evaluate the interaction effect between the two factors. For example, studying the combined effect of diet and exercise on weight loss.

  3. Repeated Measures ANOVA: Used when the same subjects are used for each treatment (e.g., measuring student performance at different times of the year).

  4. Multivariate Analysis of Variance (MANOVA): MANOVA is an extension of ANOVA when there are two or more dependent variables.

26.2 Assumptions of ANOVA

ANOVA relies on several assumptions about the data:

  • Independence of Cases: The groups compared must be composed of different individuals, with no individual being in more than one group.
  • Normality: The distribution of the residuals (differences between observed and predicted values) should follow a normal distribution.
  • Homogeneity of Variances: The variance among the groups should be approximately equal. This can be tested using Levene’s Test or Bartlett’s Test.
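As a quick illustration of checking the equal-variance assumption, here is a minimal Python sketch using SciPy's levene and bartlett functions; the three groups reuse the selection-method data from the worked example later in this chapter:

Code
# Checking homogeneity of variances with SciPy (a sketch, not part of the
# original example). Groups are the selection-method data from Section 26.4.6.
from scipy import stats

emp_referral = [11, 15, 18, 19, 22]
job_portals = [17, 18, 21, 22, 27]
consultancy = [15, 16, 18, 19, 22]

# Levene's test: robust to departures from normality
stat, p = stats.levene(emp_referral, job_portals, consultancy)
print(f"Levene:   W = {stat:.3f}, p = {p:.3f}")

# Bartlett's test: more powerful, but assumes normality
stat, p = stats.bartlett(emp_referral, job_portals, consultancy)
print(f"Bartlett: chi2 = {stat:.3f}, p = {p:.3f}")

A p-value above 0.05 in either test is consistent with the homogeneity assumption.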

26.3 ANOVA Formula

The basic formula for ANOVA is centered around the calculation of two types of variances: within-group variance and between-group variance. The F-statistic is calculated by dividing the variance between the groups by the variance within the groups:

\[F = \frac{\text{Variance between groups}}{\text{Variance within groups}}\]

26.3.1 Steps to Conduct ANOVA

  1. State the Hypothesis:

    • Null hypothesis (H0): The means of the different groups are equal.
    • Alternative hypothesis (Ha): At least one group mean is different from the others.
  2. Calculate ANOVA: Determine the F-statistic using the ANOVA formula, which involves calculating the between-group variance and the within-group variance.

  3. Compare to Critical Value: Compare the calculated F-value to a critical value obtained from an F-distribution table, considering the degrees of freedom for the numerator (between-group variance) and the denominator (within-group variance) and the significance level (alpha, usually set at 0.05).

  4. Make a Decision: If the F-value is greater than the critical value, reject the null hypothesis. This indicates that there are significant differences between the means of the groups.
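As a sketch of steps 3 and 4, SciPy can compute the critical value directly from the F-distribution, avoiding a printed table; the F-value and degrees of freedom below are placeholders echoing the worked example later in this chapter:

Code
# Comparing an F-statistic to its critical value (a sketch using SciPy).
from scipy import stats

alpha = 0.05
df_between, df_within = 2, 12   # k - 1 and N - k from the data
f_statistic = 1.605             # placeholder: F-value computed from the data

f_critical = stats.f.ppf(1 - alpha, df_between, df_within)  # approx. 3.885
print(f"Critical F = {f_critical:.3f}")
if f_statistic > f_critical:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")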

26.3.2 Post-hoc Tests

If the ANOVA indicates significant differences, post-hoc tests like Tukey’s HSD, Bonferroni, or Dunnett’s can be used to identify exactly which groups differ from each other.


26.4 One-Way ANOVA

One-way ANOVA (Analysis of Variance) is a statistical technique used to compare the means of three or more independent (unrelated) groups to determine if there are any statistically significant differences between the mean scores of these groups. It extends the t-test for comparing more than two groups, providing a way to handle complex comparisons without increasing the risk of committing Type I errors (incorrectly rejecting the null hypothesis).

26.4.1 Purpose

The primary purpose of a one-way ANOVA is to test if at least one group mean is different from the others, which suggests that at least one treatment or condition has an effect that is not common to all groups.

26.4.2 Assumptions

One-way ANOVA makes several key assumptions:

  1. Independence of Observations: Each group’s observations must be independent of the observations in other groups.
  2. Normality: Data in each group should be approximately normally distributed.
  3. Homogeneity of Variances: All groups must have the same variance, often assessed by Levene’s Test of Equality of Variances.

26.4.3 Hypotheses

The hypotheses for a one-way ANOVA are formulated as:

  • Null Hypothesis (H₀): The means of all groups are equal, implying no effect of the independent variable on the dependent variable across the groups.
  • Alternative Hypothesis (H₁): At least one group mean is different from the others, suggesting an effect of the independent variable.

26.4.4 Calculations

The analysis involves several key calculations:

  • Total Sum of Squares (SST): Measures the total variability in the dependent variable.
  • Sum of Squares Between (SSB): Reflects the variability due to differences between the group means and the overall mean.
  • Sum of Squares Within (SSW): Captures the variability within each group.
  • Degrees of Freedom (DF): Varies for each sum of squares; DF between = \(k - 1\) (where \(k\) is the number of groups) and DF within = \(N - k\) (where \(N\) is the total number of observations).
  • Mean Squares: Each sum of squares is divided by its respective degrees of freedom to obtain mean squares (MSB and MSW).
  • F-statistic: The ratio of MSB to MSW, which follows an F-distribution under the null hypothesis.
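Putting the last three items together, the test statistic is

\[ F = \frac{MSB}{MSW} = \frac{SSB/(k-1)}{SSW/(N-k)} \]

which is compared against an F-distribution with \(k-1\) and \(N-k\) degrees of freedom.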

26.4.5 Interpretation

The result of a one-way ANOVA is typically reported as an F-statistic and its corresponding p-value. The F-statistic determines whether the observed variances between means are large enough to be considered statistically significant:

  • If the F-statistic is larger than the critical value (or if the p-value is less than the significance level, typically 0.05), the null hypothesis is rejected, indicating significant differences among the means.
  • If the F-statistic is smaller than the critical value, the null hypothesis is not rejected, suggesting no significant difference among the group means.
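For a quick check before a full analysis, SciPy's f_oneway computes the one-way F-statistic and p-value in a single call; this minimal sketch uses the selection-method data from the example problem that follows:

Code
# One-way ANOVA in a single call (a sketch using scipy.stats.f_oneway).
# The groups are the selection-method data from the example below.
from scipy.stats import f_oneway

emp_referral = [11, 15, 18, 19, 22]
job_portals = [17, 18, 21, 22, 27]
consultancy = [15, 16, 18, 19, 22]

f_stat, p_value = f_oneway(emp_referral, job_portals, consultancy)
print(f"F = {f_stat:.3f}, p = {p_value:.3f}")  # F ≈ 1.605, p ≈ 0.241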

26.4.6 One-Way ANOVA Example Problem

A company wants to know the impact of three different selection methods on employee performance. The HR analyst chose 15 employees at random and collected data on the sales volume achieved by each employee. Of the 15 employees, 5 were hired through each of the selection methods. The data obtained are given below.

No.   Emp Referral   Job Portals   Consultancy
1     11             17            15
2     15             18            16
3     18             21            18
4     19             22            19
5     22             27            22

At the 0.05 level of significance, do the selection methods have different effects on the performance of employees?

Calculations:

To perform a one-way ANOVA test to see if there are significant differences in the performance of employees based on their selection method (Emp Referral, Job Portals, Consultancy), we need to calculate several components including the group means, the overall mean, the sum of squares between groups (SSB), the sum of squares within groups (SSW), and the total sum of squares (SST). Additionally, we’ll calculate the F-statistic and compare it to the critical F-value from an F-distribution table.

Data Organization:

Group A (Emp Referral): \([11, 15, 18, 19, 22]\)
Group B (Job Portals): \([17, 18, 21, 22, 27]\)
Group C (Consultancy): \([15, 16, 18, 19, 22]\)

Calculate the Means for Each Group:

\[ \bar{x}_A = \frac{11 + 15 + 18 + 19 + 22}{5} = 17 \]
\[ \bar{x}_B = \frac{17 + 18 + 21 + 22 + 27}{5} = 21 \]
\[ \bar{x}_C = \frac{15 + 16 + 18 + 19 + 22}{5} = 18 \]

Calculate the Overall Mean:

\[ \bar{x} = \frac{11 + 15 + 18 + 19 + 22 + 17 + 18 + 21 + 22 + 27 + 15 + 16 + 18 + 19 + 22}{15} = 18.667 \]

Calculate Sum of Squares Between Groups (SSB):

\[ SSB = 5\left[(\bar{x}_A - \bar{x})^2 + (\bar{x}_B - \bar{x})^2 + (\bar{x}_C - \bar{x})^2\right] \]
\[ = 5\left[(17 - 18.667)^2 + (21 - 18.667)^2 + (18 - 18.667)^2\right] \]
\[ = 5\left[2.778 + 5.444 + 0.444\right] = 5 \times 8.667 \]
\[ = 43.333 \]

Calculate Sum of Squares Within Groups (SSW):

\[ SSW = \sum_{i=1}^{5} (x_{Ai} - \bar{x}_A)^2 + \sum_{i=1}^{5} (x_{Bi} - \bar{x}_B)^2 + \sum_{i=1}^{5} (x_{Ci} - \bar{x}_C)^2 \]
\[ = [(11-17)^2 + (15-17)^2 + (18-17)^2 + (19-17)^2 + (22-17)^2] \]
\[ \quad + [(17-21)^2 + (18-21)^2 + (21-21)^2 + (22-21)^2 + (27-21)^2] \]
\[ \quad + [(15-18)^2 + (16-18)^2 + (18-18)^2 + (19-18)^2 + (22-18)^2] \]
\[ = (36 + 4 + 1 + 4 + 25) + (16 + 9 + 0 + 1 + 36) + (9 + 4 + 0 + 1 + 16) = 70 + 62 + 30 \]
\[ = 162 \]

Calculate the Total Sum of Squares (SST):

\[ SST = SSB + SSW = 43.333 + 162 = 205.333 \]

Calculate Mean Squares:

\[ \text{MSB (between groups)} = \frac{SSB}{k-1} = \frac{43.333}{3-1} = 21.667 \]
\[ \text{MSW (within groups)} = \frac{SSW}{N-k} = \frac{162}{15-3} = 13.5 \]

Calculate F-statistic:

\[ F = \frac{MSB}{MSW} = \frac{21.667}{13.5} = 1.605 \]

Degrees of Freedom:

  1. Degrees of freedom for the numerator (df1): This corresponds to the number of groups minus one. In this case, with three groups (Emp Referral, Job Portals, Consultancy), \(df1 = 3 - 1 = 2\).
  2. Degrees of freedom for the denominator (df2): This corresponds to the total number of observations minus the number of groups. For 15 employees and 3 groups, \(df2 = 15 - 3 = 12\).
  3. Significance level (α): Typically, this is set at 0.05 for most studies, implying a 95% confidence level in the results.
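The hand calculation above can be verified with a short NumPy sketch that rebuilds each quantity from the raw data:

Code
# Verifying the hand calculation (a sketch using NumPy and SciPy).
import numpy as np
from scipy import stats

groups = [np.array([11, 15, 18, 19, 22]),   # Emp Referral
          np.array([17, 18, 21, 22, 27]),   # Job Portals
          np.array([15, 16, 18, 19, 22])]   # Consultancy

all_values = np.concatenate(groups)
grand_mean = all_values.mean()              # 18.667
k, N = len(groups), len(all_values)         # 3 groups, 15 observations

ssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # 43.333
ssw = sum(((g - g.mean()) ** 2).sum() for g in groups)            # 162.0

msb = ssb / (k - 1)                         # 21.667
msw = ssw / (N - k)                         # 13.5
f_stat = msb / msw                          # 1.605
f_crit = stats.f.ppf(0.95, k - 1, N - k)    # 3.885
print(f"F = {f_stat:.3f}, critical F = {f_crit:.3f}")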

26.4.7 Critical F-value Interpretation

You would locate the value in the F-table where \(df1 = 2\) and \(df2 = 12\) at the chosen significance level \(α = 0.05\). Such critical values are provided in statistical tables in textbooks or online resources.

For \(df1 = 2\) and \(df2 = 12\) at \(α = 0.05\), the critical F-value is approximately 3.89. Since the calculated F-statistic 1.605 is less than 3.89, we fail to reject the null hypothesis, concluding that there is no significant effect of the selection method on employee performance at the 0.05 significance level.

This means that, based on these ANOVA results, the different selection methods do not have a statistically significant impact on employee sales performance.

26.4.8 One-Way ANOVA Test in R

Code
# Prepare the Data
emp_referral <- c(11, 15, 18, 19, 22)
job_portals <- c(17, 18, 21, 22, 27)
consultancy <- c(15, 16, 18, 19, 22)
alpha = 0.05
# Combining the data into a single data frame
data <- data.frame(
  Sales = c(emp_referral, job_portals, consultancy),
  Method = factor(rep(c("Emp Referral", "Job Portals", "Consultancy"), each = 5))
)
data
   Sales       Method
1     11 Emp Referral
2     15 Emp Referral
3     18 Emp Referral
4     19 Emp Referral
5     22 Emp Referral
6     17  Job Portals
7     18  Job Portals
8     21  Job Portals
9     22  Job Portals
10    27  Job Portals
11    15  Consultancy
12    16  Consultancy
13    18  Consultancy
14    19  Consultancy
15    22  Consultancy
Code
# Perform ANOVA Test
result <- aov(Sales ~ Method, data = data)

# Results
summary(result)
            Df Sum Sq Mean Sq F value Pr(>F)
Method       2  43.33   21.67   1.605  0.241
Residuals   12 162.00   13.50               
Code
# Get the summary of the ANOVA test
summary_result <- summary(result)

# Extract the p-value
p_value <- summary_result[[1]]["Method", "Pr(>F)"]

# hypothesis decision
if (p_value < alpha) {
  cat("Reject null hypothesis\n")
} else {
  cat("Do not reject null hypothesis\n")
}
Do not reject null hypothesis

26.4.9 One-Way ANOVA Test in Python

Install statsmodels package

!pip3 install statsmodels
Code
import pandas as pd
from scipy import stats
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Step 1: Prepare the Data
emp_referral = [11, 15, 18, 19, 22]
job_portals = [17, 18, 21, 22, 27]
consultancy = [15, 16, 18, 19, 22]
alpha = 0.05
# Combining the data into a single DataFrame
data = pd.DataFrame({
    'Sales': emp_referral + job_portals + consultancy,
    'Method': ['Emp Referral'] * 5 + ['Job Portals'] * 5 + ['Consultancy'] * 5
})
data
    Sales        Method
0      11  Emp Referral
1      15  Emp Referral
2      18  Emp Referral
3      19  Emp Referral
4      22  Emp Referral
5      17   Job Portals
6      18   Job Portals
7      21   Job Portals
8      22   Job Portals
9      27   Job Portals
10     15   Consultancy
11     16   Consultancy
12     18   Consultancy
13     19   Consultancy
14     22   Consultancy
Code
# Step 2: Perform ANOVA Test
model = ols('Sales ~ C(Method)', data=data).fit()

# Step 3: Get the summary to see the results
result = sm.stats.anova_lm(model, typ=2)
print(result)
               sum_sq    df         F    PR(>F)
C(Method)   43.333333   2.0  1.604938  0.241176
Residual   162.000000  12.0       NaN       NaN
Code
# Extract p-value
p_value = result.loc['C(Method)', 'PR(>F)']
# Hypothesis decision
if p_value < alpha:
    print("Reject null hypothesis")
else:
    print("Do not reject null hypothesis")
Do not reject null hypothesis

26.5 Two-Way ANOVA

Two-Way ANOVA, also known as factorial ANOVA, extends the principles of the One-Way ANOVA by not just comparing means across one categorical independent variable, but two. This method allows researchers to study the effect of two factors simultaneously and to evaluate if there is an interaction between the two factors on a continuous dependent variable.

26.5.1 Purpose

The primary goals of Two-Way ANOVA are:

  1. To determine if there is a significant effect of each of the two independent variables on the dependent variable. This is analogous to conducting multiple One-Way ANOVAs, each for a different factor, though doing so separately ignores the potential interaction between the factors.

  2. To determine if there is a significant interaction effect between the two independent variables on the dependent variable. An interaction effect occurs when the effect of one independent variable on the dependent variable changes across the levels of the other independent variable.

26.5.2 Assumptions

  • Independence of observations: Each subject’s response is independent of the others’.
  • Normality: The data for each combination of groups formed by the two factors should be normally distributed.
  • Homogeneity of variances: The variances among the groups should be approximately equal.

26.5.3 Components

In a Two-Way ANOVA, the data can be represented in a matrix format where one factor’s levels are on the rows, the other factor’s levels are on the columns, and the cell values are the means (or other statistics) of the dependent variable for the combinations of factor levels.

26.5.4 Hypotheses

There are three sets of null hypotheses in a Two-Way ANOVA:

  1. Main Effect of Factor A: The means of the different levels of factor A are equal.
  2. Main Effect of Factor B: The means of the different levels of factor B are equal.
  3. Interaction Effect of Factors A and B: There is no interaction between factors A and B; the effect of factor A on the dependent variable is the same at all levels of factor B, and vice versa.

26.5.5 Calculation

Two-Way ANOVA involves partitioning the total variance observed in the data into components attributable to each factor and their interaction. The sums of squares for these components are compared to a residual (error) term to produce F-statistics for each hypothesis.
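In symbols, with factors A and B, the partition is

\[ SST = SS_A + SS_B + SS_{AB} + SS_E \]

and each hypothesis gets its own F-ratio against the error mean square:

\[ F_A = \frac{MS_A}{MS_E}, \qquad F_B = \frac{MS_B}{MS_E}, \qquad F_{AB} = \frac{MS_{AB}}{MS_E} \]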

26.5.6 Interpretation

  • Main effects: Significant F-statistics for either main effect indicate that there are significant differences in the dependent variable across the levels of that factor, ignoring the other factor.
  • Interaction effect: A significant F-statistic for the interaction indicates that the effect of one factor on the dependent variable differs across the levels of the other factor.

If there’s a significant interaction, it’s crucial to interpret the main effects within the context of the interaction, often requiring a more detailed analysis such as simple effects tests or plotting interaction plots to understand the nature of the interaction.
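As a sketch of the plotting step, statsmodels ships an interaction_plot helper; the data here anticipate the plant-growth example in the next subsection, and matplotlib is assumed to be installed:

Code
# Drawing an interaction plot (a sketch; assumes matplotlib is installed).
# Data are the plant-growth measurements from the example below.
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.factorplots import interaction_plot

data = pd.DataFrame({
    'PlantHeight': [15, 17, 16, 14, 15, 15, 18, 20, 19, 22, 21, 23],
    'FertilizerType': ['A'] * 6 + ['B'] * 6,
    'IrrigationMethod': ['X', 'X', 'X', 'Y', 'Y', 'Y'] * 2,
})

# Non-parallel lines suggest an interaction between the two factors.
fig = interaction_plot(x=data['IrrigationMethod'],
                       trace=data['FertilizerType'],
                       response=data['PlantHeight'])
plt.show()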

26.5.7 Example Problem on Two-Way ANOVA

Let’s consider a study to evaluate the impact of two factors on plant growth: Fertilizer Type (A, B) and Irrigation Method (X, Y). The objective is to determine the effect of these two factors and their interaction on plant height. Here is the hypothetical data:

  • Fertilizer Type A, Irrigation X: Plant heights are 15, 17, 16 cm.
  • Fertilizer Type A, Irrigation Y: Plant heights are 14, 15, 15 cm.
  • Fertilizer Type B, Irrigation X: Plant heights are 18, 20, 19 cm.
  • Fertilizer Type B, Irrigation Y: Plant heights are 22, 21, 23 cm.

The hypothesis for this Two-Way ANOVA test would be:

  • Null Hypothesis for Fertilizer Type (H0a): There is no difference in plant height across the different types of fertilizer.
  • Null Hypothesis for Irrigation Method (H0b): There is no difference in plant height across the different irrigation methods.
  • Null Hypothesis for Interaction (H0ab): There is no interaction effect between fertilizer type and irrigation method on plant height.

Let’s calculate the Two-Way ANOVA for this example.

The Two-Way ANOVA results for our hypothetical study on the impact of fertilizer type and irrigation method on plant growth yield the following:

  • Fertilizer Type: The sum of squares is 80.08, with an F-statistic of 96.1 and a p-value of 0.00001. This indicates a highly significant effect of fertilizer type on plant height, meaning we can reject the null hypothesis that there’s no difference in plant height across the different types of fertilizer.

  • Irrigation Method: The sum of squares is 2.08, with an F-statistic of 2.5 and a p-value of 0.1525. This suggests that the effect of irrigation method on plant height is not statistically significant at the 0.05 level, and we fail to reject the null hypothesis for the irrigation method.

  • Interaction between Fertilizer Type and Irrigation Method: The sum of squares for the interaction is 14.08, with an F-statistic of 16.9 and a p-value of 0.003386. This indicates a significant interaction effect between fertilizer type and irrigation method on plant height, meaning the effect of one factor depends on the level of the other factor.

Based on these results: - There’s a significant difference in plant growth across different fertilizer types. - There’s no significant difference in plant growth across different irrigation methods. - The interaction between fertilizer type and irrigation method significantly affects plant growth, suggesting that the best combination of factors for plant growth depends on both the type of fertilizer and the method of irrigation used together, not just one or the other in isolation.

26.5.8 Two-Way ANOVA Test in R

install.packages("stats")
install.packages("agricolae")
Code
# Loading necessary library
library(stats)
library(agricolae)

# Preparing the data
PlantHeight <- c(15, 17, 16, 14, 15, 15, 18, 20, 19, 22, 21, 23)
FertilizerType <- factor(rep(c('A', 'B'), each=6))
IrrigationMethod <- factor(rep(c('X', 'Y', 'X', 'Y'), each=3))
alpha = 0.05
data <- data.frame(PlantHeight, FertilizerType, IrrigationMethod)
data
   PlantHeight FertilizerType IrrigationMethod
1           15              A                X
2           17              A                X
3           16              A                X
4           14              A                Y
5           15              A                Y
6           15              A                Y
7           18              B                X
8           20              B                X
9           19              B                X
10          22              B                Y
11          21              B                Y
12          23              B                Y
Code
# Conducting Two-Way ANOVA
result <- aov(PlantHeight ~ FertilizerType * IrrigationMethod, data = data)
summary(result)
                                Df Sum Sq Mean Sq F value   Pr(>F)    
FertilizerType                   1  80.08   80.08    96.1 9.85e-06 ***
IrrigationMethod                 1   2.08    2.08     2.5  0.15250    
FertilizerType:IrrigationMethod  1  14.08   14.08    16.9  0.00339 ** 
Residuals                        8   6.67    0.83                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

26.5.9 Two-Way ANOVA Test in Python

Code
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# Preparing the data
PlantHeight = [15, 17, 16, 14, 15, 15, 18, 20, 19, 22, 21, 23]
FertilizerType = ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B']
IrrigationMethod = ['X', 'X', 'X', 'Y', 'Y', 'Y', 'X', 'X', 'X', 'Y', 'Y', 'Y']
alpha = 0.05
# Creating the DataFrame
data = pd.DataFrame({
    'PlantHeight': PlantHeight,
    'FertilizerType': FertilizerType,
    'IrrigationMethod': IrrigationMethod
})
data
    PlantHeight FertilizerType IrrigationMethod
0            15              A                X
1            17              A                X
2            16              A                X
3            14              A                Y
4            15              A                Y
5            15              A                Y
6            18              B                X
7            20              B                X
8            19              B                X
9            22              B                Y
10           21              B                Y
11           23              B                Y
Code
# Conducting Two-Way ANOVA
model = ols('PlantHeight ~ C(FertilizerType) * C(IrrigationMethod)', data=data).fit()

# Displaying the summary of ANOVA
result = anova_lm(model, typ=2)
print(result)
                                          sum_sq   df     F    PR(>F)
C(FertilizerType)                      80.083333  1.0  96.1  0.000010
C(IrrigationMethod)                     2.083333  1.0   2.5  0.152502
C(FertilizerType):C(IrrigationMethod)  14.083333  1.0  16.9  0.003386
Residual                                6.666667  8.0   NaN       NaN

26.6 Post Hoc Tests for ANOVA

Overview

ANOVA (Analysis of Variance) is a statistical method used to test differences between two or more group means. When an ANOVA indicates significant differences, it does not tell which specific groups differ. Post hoc tests are used to conduct pairwise comparisons between group means after a significant ANOVA result.

Purpose

The main purpose of post hoc tests is to control the Type I error rate, which inflates as the number of pairwise comparisons grows. These tests apply corrections so that the overall (family-wise) error rate stays at the nominal significance level.

Why Conduct Post-hoc Tests:

Identify Differences: If the ANOVA results are significant, it only tells us that at least one group mean is different from the others. However, it doesn’t specify which groups are different from which. Post-hoc tests are conducted to pinpoint exactly which pairs of groups differ.

Control Type I Error: When making multiple comparisons, the chance of committing a Type I error (false positive) increases. Post-hoc tests adjust for this multiple comparison problem to maintain the overall Type I error rate at the desired level.

Common Post Hoc Tests

  1. Tukey’s Honest Significant Difference (HSD): This is one of the most popular post hoc tests when all groups have equal sample sizes. It controls the family-wise error rate and is robust across a range of scenarios.

  2. Bonferroni Correction: This method adjusts the p-value threshold by dividing it by the number of comparisons (equivalently, multiplying each raw p-value by the number of comparisons). It is very conservative, reducing the power to detect differences when numerous comparisons are made; a short sketch of this correction appears after this list.

  3. Scheffé’s Test: Another conservative test, Scheffé’s test is particularly useful when exploring all possible contrasts among group means, not just pairwise comparisons.

  4. Dunnett’s Test: This test compares a control group against all other groups and is useful in clinical trials.

  5. Fisher’s Least Significant Difference (LSD): This test does not adjust for multiple comparisons, so it has higher power but also a higher risk of type I errors.
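To make the Bonferroni idea concrete, here is a minimal Python sketch using multipletests from statsmodels; the three p-values are hypothetical stand-ins for pairwise comparisons:

Code
# Bonferroni correction for multiple comparisons (a sketch).
# The raw p-values below are hypothetical.
from statsmodels.stats.multitest import multipletests

raw_pvalues = [0.010, 0.020, 0.400]
reject, adj_pvalues, _, _ = multipletests(raw_pvalues, alpha=0.05,
                                          method='bonferroni')
print("Adjusted p-values:", adj_pvalues)        # each raw p times 3, capped at 1
print("Reject null per comparison:", reject)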

Example: Tukey’s HSD Test in a One-Way ANOVA

Scenario

Suppose a botanist wants to compare the growth of plant species in different fertilizers. They have four types of fertilizers and measure growth (in cm) after a set period.

Data
  • Fertilizer A: 15, 14, 16, 14, 15
  • Fertilizer B: 22, 20, 21, 22, 21
  • Fertilizer C: 28, 25, 27, 30, 29
  • Fertilizer D: 15, 13, 14, 15, 14

Post Hoc Tests for One-Way ANOVA in R

First, we conduct a one-way ANOVA to see if there are significant differences among the means of these groups.

Code
data <- data.frame(
    growth = c(15, 14, 16, 14, 15, 22, 20, 21, 22, 21, 28, 25, 27, 30, 29, 15, 13, 14, 15, 14),
    fertilizer = factor(rep(c('A', 'B', 'C', 'D'), each = 5))
)
data
   growth fertilizer
1      15          A
2      14          A
3      16          A
4      14          A
5      15          A
6      22          B
7      20          B
8      21          B
9      22          B
10     21          B
11     28          C
12     25          C
13     27          C
14     30          C
15     29          C
16     15          D
17     13          D
18     14          D
19     15          D
20     14          D
Code
anova_result <- aov(growth ~ fertilizer, data = data)
summary_anova <- summary(anova_result)

p_value <- summary_anova[[1]]["fertilizer", "Pr(>F)"]

if (p_value < 0.05) {
    tukey_results <- TukeyHSD(anova_result)
    print(tukey_results)
} else {
    cat("No significant differences found among the means.\n")
}
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = growth ~ fertilizer, data = data)

$fertilizer
     diff        lwr        upr     p adj
B-A   6.4   4.221112   8.578888 0.0000016
C-A  13.0  10.821112  15.178888 0.0000000
D-A  -0.6  -2.778888   1.578888 0.8588883
C-B   6.6   4.421112   8.778888 0.0000011
D-B  -7.0  -9.178888  -4.821112 0.0000005
D-C -13.6 -15.778888 -11.421112 0.0000000

To interpret the results from Tukey’s Honest Significant Difference (HSD) test shown above, let’s break down each component of the output:

26.6.1 Output Breakdown

  • diff: The difference in means between the two groups being compared.
  • lwr: The lower bound of the 95% confidence interval for the mean difference.
  • upr: The upper bound of the 95% confidence interval for the mean difference.
  • p adj: The p-value adjusted for multiple comparisons.

Interpretation

Tukey’s test compares every pair of fertilizers; a difference in means larger than the critical HSD value is flagged as significant, with a confidence interval and adjusted p-value reported for each comparison. In the output above, every pairwise difference except D-A is significant (adjusted p < 0.05): fertilizers B and C each produce significantly greater growth than A and D, and C outperforms B. The D-A comparison (diff = -0.6, p adj = 0.859) shows no significant difference.

Practical Application

The botanist can use the results to determine which fertilizers significantly enhance growth compared to others, guiding future experimental designs or agricultural practices.

Post hoc tests are crucial for making informed decisions after an ANOVA. They help identify specific differences between groups, allowing researchers to understand deeper nuances beyond the initial ANOVA results. When selecting a post hoc test, consider the balance between controlling type I errors and maintaining statistical power, based on the study design and objectives.
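For completeness, the Tukey analysis above can be mirrored in Python with statsmodels' pairwise_tukeyhsd; a minimal sketch on the same fertilizer data:

Code
# Tukey's HSD in Python (a sketch mirroring the R example above).
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

data = pd.DataFrame({
    'growth': [15, 14, 16, 14, 15, 22, 20, 21, 22, 21,
               28, 25, 27, 30, 29, 15, 13, 14, 15, 14],
    'fertilizer': ['A'] * 5 + ['B'] * 5 + ['C'] * 5 + ['D'] * 5,
})

tukey = pairwise_tukeyhsd(endog=data['growth'], groups=data['fertilizer'],
                          alpha=0.05)
print(tukey.summary())  # mean differences, confidence bounds, adjusted p-values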


26.7 Repeated Measures ANOVA

Repeated Measures ANOVA (Analysis of Variance) is a statistical technique used to compare means across three or more measurements (conditions) taken on the same subjects. This type of ANOVA is particularly useful when dealing with correlated data, such as measurements taken over time from the same subjects or under different conditions. By accounting for between-subject variability, it provides a more powerful means of detecting differences than separate independent tests.

26.7.1 Purpose

The primary purpose of a Repeated Measures ANOVA is to determine if there are significant differences between multiple measurements (or conditions) taken from the same subjects. It is often used in experiments where the same subjects are subjected to different treatments or conditions over time.

26.7.2 Assumptions

Repeated Measures ANOVA relies on several key assumptions:

  1. Sphericity: The variances of the differences between all combinations of related groups (levels) must be equal. This assumption can be tested with Mauchly’s test of sphericity (a sketch follows this list).
  2. Normality: The differences between treatment levels for each subject should be normally distributed.
  3. Independence: Although observations within groups may be related, groups must be independent of each other.
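Mauchly's test is not part of base SciPy; here is a minimal sketch assuming the third-party pingouin package (pip install pingouin) and its sphericity function, with hypothetical data: the chapter's heart-rate pattern plus a little noise, since perfectly regular data make the test degenerate.

Code
# Mauchly's test of sphericity (a sketch; assumes pingouin is installed).
# Data are hypothetical: the heart-rate pattern from the example below,
# jittered so the covariance matrix is not singular.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(42)
base = [120, 130, 135, 140, 125, 118, 123, 128, 133, 138]
rows = [{'Subject': s + 1, 'Time': f'T{t + 1}',
         'HeartRate': base[s] - 10 * t + rng.normal(0, 2)}
        for s in range(10) for t in range(4)]
data = pd.DataFrame(rows)

spher = pg.sphericity(data, dv='HeartRate', within='Time', subject='Subject')
print(spher)  # Mauchly's W, chi-square, dof, and p-value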

26.7.3 Hypotheses

The hypotheses for a Repeated Measures ANOVA are as follows:

  • Null Hypothesis (H₀): There are no differences in the means across the measurements; any observed differences are due to random variation.
  • Alternative Hypothesis (H₁): There are significant differences in the means across the measurements.

26.7.4 Calculations

The analysis involves several key calculations:

  • Within-Subjects Sum of Squares (SSW): Measures the variability within subjects over the different time points or conditions.
  • Between-Subjects Sum of Squares (SSB): Measures the variability between subjects.
  • Total Sum of Squares (SST): The aggregate of SSW and SSB.
  • Degrees of Freedom (DF): Calculated separately for within-subjects and between-subjects effects.
  • F-statistic: The ratio of the mean squares between conditions to the mean squares within subjects, following an F-distribution under the null hypothesis.
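For a balanced design with \(k\) conditions and \(n\) subjects, the within-subjects test statistic is

\[ F = \frac{MS_{\text{time}}}{MS_{\text{error}}}, \qquad df_{\text{time}} = k - 1, \qquad df_{\text{error}} = (k-1)(n-1) \]

With \(k = 4\) time points and \(n = 10\) subjects in the example below, this gives \(df_{\text{time}} = 3\) and \(df_{\text{error}} = 27\), matching the ANOVA tables later in this section.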

26.7.5 Interpretation

The result of a Repeated Measures ANOVA is typically reported as an F-statistic and its corresponding p-value. This helps to determine if the observed differences across the repeated measures are statistically significant:

  • If the F-statistic is larger than the critical value (or the p-value is smaller than the significance level, typically 0.05), it suggests significant differences across the repeated measures, leading to the rejection of the null hypothesis.
  • If the F-statistic is smaller than the critical value, the null hypothesis is not rejected, suggesting that the differences across measures are not statistically significant.

The example below studies the effectiveness of a new drug on heart rate recovery at different time points post-exercise; it provides a complete dataset and shows how to analyze it with Repeated Measures ANOVA in R and Python.

26.7.6 Repeated Measures ANOVA Example Problem

Let’s assume there are 10 subjects in the study, and each subject’s heart rate is recorded at four time points: immediately after exercise (T1), 1 minute after (T2), 3 minutes after (T3), and 5 minutes after (T4).

Subject   T1    T2    T3    T4
1         120   110   100    90
2         130   120   110   100
3         135   125   115   105
4         140   130   120   110
5         125   115   105    95
6         118   108    98    88
7         123   113   103    93
8         128   118   108    98
9         133   123   113   103
10        138   128   118   108

26.7.7 Repeated Measures ANOVA Using R

First, we’ll prepare the data in R and run a Repeated Measures ANOVA using the afex package.

Code
library(afex)

# Prepare the data
data <- data.frame(
  Subject = factor(rep(1:10, each = 4)),
  Time = factor(rep(c("T1", "T2", "T3", "T4"), times = 10)),
  HeartRate = c(120, 110, 100, 90, 130, 120, 110, 100, 135, 125, 115, 105, 140, 130, 120, 110,
                125, 115, 105, 95, 118, 108, 98, 88, 123, 113, 103, 93, 128, 118, 108, 98,
                133, 123, 113, 103, 138, 128, 118, 108)
)
data
   Subject Time HeartRate
1        1   T1       120
2        1   T2       110
3        1   T3       100
4        1   T4        90
5        2   T1       130
6        2   T2       120
7        2   T3       110
8        2   T4       100
9        3   T1       135
10       3   T2       125
11       3   T3       115
12       3   T4       105
13       4   T1       140
14       4   T2       130
15       4   T3       120
16       4   T4       110
17       5   T1       125
18       5   T2       115
19       5   T3       105
20       5   T4        95
21       6   T1       118
22       6   T2       108
23       6   T3        98
24       6   T4        88
25       7   T1       123
26       7   T2       113
27       7   T3       103
28       7   T4        93
29       8   T1       128
30       8   T2       118
31       8   T3       108
32       8   T4        98
33       9   T1       133
34       9   T2       123
35       9   T3       113
36       9   T4       103
37      10   T1       138
38      10   T2       128
39      10   T3       118
40      10   T4       108
Code
# Run the Repeated Measures ANOVA (anova_table = list(es = "none") suppresses
# the effect-size column in the output)
results <- aov_ez("Subject", "HeartRate", data, within = "Time", anova_table = list(es = "none"))

# Print the results
print(results$anova_table)
Anova Table (Type 3 tests)

Response: HeartRate
     num Df den Df        MSE          F    Pr(>F)    
Time      3     27 2.6316e-15 6.3332e+17 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

26.7.8 Repeated Measures ANOVA Using Python

Now, let’s perform the same analysis using Python with the statsmodels package.

Code
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Create the DataFrame
data_dict = {
    "Subject": ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"] * 4,
    "Time": ["T1", "T1", "T1", "T1", "T1", "T1", "T1", "T1", "T1", "T1",
             "T2", "T2", "T2", "T2", "T2", "T2", "T2", "T2", "T2", "T2",
             "T3", "T3", "T3", "T3", "T3", "T3", "T3", "T3", "T3", "T3",
             "T4", "T4", "T4", "T4", "T4", "T4", "T4", "T4", "T4", "T4"],
    "HeartRate": [120, 130, 135, 140, 125, 118, 123, 128, 133, 138,
                  110, 120, 125, 130, 115, 108, 113, 118, 123, 128,
                  100, 110, 115, 120, 105, 98, 103, 108, 113, 118,
                  90, 100, 105, 110, 95, 88, 93, 98, 103, 108]
}

data = pd.DataFrame(data_dict)
data
   Subject Time  HeartRate
0        1   T1        120
1        2   T1        130
2        3   T1        135
3        4   T1        140
4        5   T1        125
5        6   T1        118
6        7   T1        123
7        8   T1        128
8        9   T1        133
9       10   T1        138
10       1   T2        110
11       2   T2        120
12       3   T2        125
13       4   T2        130
14       5   T2        115
15       6   T2        108
16       7   T2        113
17       8   T2        118
18       9   T2        123
19      10   T2        128
20       1   T3        100
21       2   T3        110
22       3   T3        115
23       4   T3        120
24       5   T3        105
25       6   T3         98
26       7   T3        103
27       8   T3        108
28       9   T3        113
29      10   T3        118
30       1   T4         90
31       2   T4        100
32       3   T4        105
33       4   T4        110
34       5   T4         95
35       6   T4         88
36       7   T4         93
37       8   T4         98
38       9   T4        103
39      10   T4        108
Code
# Convert Subject to a categorical variable
data['Subject'] = data['Subject'].astype('category')

# Perform the Repeated Measures ANOVA
aovrm = AnovaRM(data, 'HeartRate', 'Subject', within=['Time'])
fit = aovrm.fit()

# Print the summary of the ANOVA results
print(fit.summary())
                            Anova
==============================================================
                   F Value               Num DF  Den DF Pr > F
--------------------------------------------------------------
Time 784609884054114185396810153984.0000 3.0000 27.0000 0.0000
==============================================================

The output from the Repeated Measures ANOVA provides the following statistical values related to the effect of time on heart rate across the repeated measurements:

Output Details:

  • F Value: astronomically large (about 7.8e+29 in Python; the R table above reports about 6.3e+17)
  • Num DF (Numerator Degrees of Freedom): 3.0000
  • Den DF (Denominator Degrees of Freedom): 27.0000
  • Pr > F (p-value): 0.0000 (reported as < 2.2e-16 in R)

Interpretation:

  • F Value: The F-value is effectively infinite because this illustrative dataset is perfectly regular: every subject’s heart rate drops by exactly 10 bpm at each successive time point, so the within-subject residual variance is essentially zero (the MSE of about 2.6e-15 in the R table is floating-point noise). Dividing the between-condition mean square by a near-zero error term produces these enormous, numerically unstable F values; real data would yield a far more moderate statistic.
  • p-value: The p-value is effectively 0, far below the typical alpha level of 0.05 used to determine statistical significance.
  • Statistical Significance: We therefore reject the null hypothesis and conclude that heart rate differs significantly across the four time points (T1, T2, T3, T4), consistent with the steady recovery pattern built into the data.